ABSTRACT

This  paper  investigates  the  sensitivity  of  maximum  likelihood  estimates 
with a view to finding out how many individuals are needed and how many
purchases are required for each individual to accurately estimate parameters for
zero  order  models.  Our  results  reveal  that  the  estimation  of  the  typically 
formulated  original  parameters  requires  about  2000  individuals  with  5 
purchases  per  consumer.  In  many  zero  order  applications,  however,  knowledge 
of  market  share  and  loyalty  index,  which  are  both  functions  of  the  original 
parameters,  should  be  adequate.  Reduced  sample  sizes  of  about  400  with  5 
purchase  records  per  household  are  shown  to  be  sufficient  to  estimate  the 
transformed  parameters,  the  market  share  and  the  loyalty  index.  Our  numerical 
results  use  the  beta  distribution  as  the  mixing  distribution  for  the  individual 
p  values;  however,  the  spirit  of  our  results  holds  for  arbitrary  mixtures. 
Namely,  much  smaller  sample  sizes  are  required  if  we  only  need  to  know  the 
location  (market  share)  and  shape  (loyalty  index)  of  the  mixing  distribution 
than  if  we  need  detailed  knowledge  of  the  original  parameters. 
INTRODUCTION

Marketing  researchers  usually  provide  only  point  estimates  of  model 
parameters;  that  is,  they  do  not  take  into  account  the  effect  of  sampling 
variation  on  these  parameter  estimates.  There  are  only  a  few  exceptions 
(e.g., Shoemaker and Staelin [11], Van Mechelen [13]). In both these
papers the authors reached the same conclusion, which is that the
parameter estimates of the models studied are sensitive and that the
coefficients of variation for commonly used sample sizes are large.

Shoemaker  and  Staelin  [11]  examined  the  effects  of  sampling  variation 
on  market  share  estimates  of  new  consumer  products  in  the  Parfitt  and 
Collins  model  [9].  Their  results  indicate  that  the  coefficients  of 
variation  associated  with  the  prediction  of  market  shares  for  normally 
used  sample  sizes  are  in  the  range  of  20%  to  40%.  Further,  the  sample 
size  of  over  2500  that  is  required  for  a  coefficient  of  variation  of  .1 
is  much  greater  than  that  commonly  used  to  estimate  market  share  with  the 
use of the Parfitt and Collins model.

Van  Mechelen  [13]  found  the  estimate  of  total  buyers  in  a  particular 
period  in  SPRINTER  mod  I  [12]  to  be  sensitive  to  small  variations  in  input 
data.  Specifically,  he  encountered  coefficients  of  variation  of  over  20% 
given  a  small  variance  of  3%  in  the  input  data. 

In  this  paper  our  objective  is  to  investigate  the  sensitivity  of 
maximum likelihood estimates in zero order models with a view to determining
sample sizes required to estimate parameters within a given accuracy level
(say, ±10% of the true value). Zero order models have been frequently used
to describe brand switching behavior (Bass, Jeuland, and Wright [2],
Kalwani  and  Morrison  [6],  and  Massy,  Montgomery,  and  Morrison  [8]).  Each 
consumer in these models is assumed to purchase a given brand (say, Brand
1) with probability p, and Brand 0, representing the aggregate "all other" class,
with the complementary probability 1-p. These models allow for heterogeneity
in  the  population  by  letting  the  p  values  differ  across  individuals.  Since 
each  person  is  a  zero  order  process  defined  by  a  single  parameter  p,  these 
zero  order  models  are  completely  defined  by  f(p),  the  distribution  of  p  values 
across individuals in the population. A statistical question that has
received little attention arises: What are the data requirements for
accurately estimating f(p)? In this paper we address the issue of how many
individuals need to be in the sample plus how many trials need to be observed
for each individual to accurately estimate f(p).

In  some  empirical  settings  the  exact  form  of  the  purchase  probability 
distribution,  f(p),  is  of  interest  and  we  need  to  estimate  the  original 
parameters  of  f(p)  (e.g.,  in  this  paper  this  would  amount  to  estimating  the 
original  parameters  of  the  beta  distribution).  Our  results  reveal  that  the 
sample  size  requirements  in  such  settings  are  excessive.  However,  in  most 
applications  of  zero  order  models  the  properties  of  interest  are  not  the 
parameters  themselves  but  functions  of  these  parameters  like  market  share  and 
a measure of the loyalty or switching rate (e.g., Hendry switching constant
[6],  Bass's  correlation  coefficient  [1],  and  Sabavala  and  Morrison's  loyalty 
index  [10]).  Our  findings  indicate  that  these  transformed  parameters  - 
market  share  and  loyalty  rate  -  are  "stable",  that  is,  they  vary  much  less 
than  the  original  parameters. 

The sample size requirements are determined for a coefficient of
variation of .05. Assuming that the parameter estimates are normally
distributed, this implies that the parameters are estimated within ±10
percent of their true values 95 percent of the time. The sample
size  requirements  at  other  levels  of  accuracy  can  be  easily  calculated  from 
a  knowledge  of  the  sample  size  required  for  the  10  percent  accuracy  level. 
For  instance,  sample  size  will  have  to  be  quadrupled  to  improve  the  accuracy 
to  5  percent.  The  variance  of  the  estimator  is  inversely  proportional  to  the 
sample  size;  hence,  a  quadrupling  of  the  sample  size  is  needed  to  reduce  the 
standard  deviation  by  a  factor  of  2. 
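
The arithmetic behind this accuracy criterion can be checked with a short sketch. It assumes, as the text does, a normally distributed and unbiased estimator; the function names are illustrative and do not come from the paper.

```python
import math

def coverage_within(rel_error: float, cv: float) -> float:
    """Probability that a normal, unbiased estimate falls within
    +/- rel_error (as a fraction of the true value) when the
    estimator has coefficient of variation cv."""
    z = rel_error / cv                      # half-width in standard deviations
    return math.erf(z / math.sqrt(2.0))     # P(|Z| <= z) for standard normal Z

def scaled_sample_size(n_base: float, rel_error_base: float, rel_error_new: float) -> float:
    """Variance is inversely proportional to n, so the required n grows
    with the inverse square of the target relative error."""
    return n_base * (rel_error_base / rel_error_new) ** 2

print(round(coverage_within(0.10, 0.05), 4))   # 0.9545
print(scaled_sample_size(2000, 0.10, 0.05))    # 8000.0
```

A coefficient of variation of .05 puts the ±10 percent band at two standard deviations, which is where the "95 percent of the time" figure comes from; halving the band to ±5 percent multiplies the required sample size by four.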

Given the aforementioned criterion of ±10 percent accuracy at the 95
percent confidence level, the resulting sample sizes required for the
estimation of the original, untransformed model parameters exceed 2000 when
only  5  purchase  records  are  available  for  every  household.  These  sample 
size  requirements  are  much  larger  than  most  researchers'  intuitions  would 
indicate and have not been met in most published studies. When the
transformed, more stable parameters (market share and loyalty rate) are estimated,
however,  sample  size  requirements  given  5  purchases  per  consumer  are  less 
than  400  for  the  U  shaped  purchase  probability  distribution  typically 
encountered  in  empirical  research. 

The  remainder  of  this  paper  is  organized  as  follows:  first  we  discuss 
the  use  of  market  share  and  loyalty  rate  as  measures  of  market  response  in 
applications  of  zero  order  models.  This  is  followed  by  a  presentation  of 
the  overall  methodology  including  the  likelihood  expressions  for  estimation 
of  the  original  as  well  as  the  transformed  parameters.  Next,  the  findings 
from  the  simulated  data  are  reported.  Finally,  the  results  of  the  paper 
are  summarized  in  the  concluding  section. 


MEASURING  MARKET  RESPONSE 

As  mentioned  earlier,  knowledge  of  the  purchase  probability  distribution 
completely  defines  a  zero  order  model.     That  is,  various  response  measures 
like market share, repeat-purchase and switching probabilities, conditional
as well as unconditional, can be easily obtained given the exact form of the
purchase probability distribution. In most applications of zero order models,
however, the marketing researcher may only be interested in obtaining the
brand share and a measure of brand loyalty (e.g., Hendry switching constant
[6], Bass's correlation coefficient [1], and Sabavala and Morrison's loyalty
index [10]). It turns out that, assuming beta heterogeneity on p, other
response measures like conditional and unconditional switching as well as
repeat-purchase  probabilities  can  be  easily  obtained  from  a  knowledge  of  the 
market  shares  and  a  measure  of  brand  loyalty. 

The beta distribution, due to its flexibility to take different shapes
and its mathematical tractability, is often used to represent the functional form
of the purchase probability distribution. The beta distribution takes a bell,
U, J, or reverse-J shape according to the values of its parameters, α and β.
The functional form of the beta distribution is given by

$$ f(p) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, p^{\alpha-1}(1-p)^{\beta-1}, \quad 0 < p < 1, \; \alpha, \beta > 0, \tag{1} $$

where Γ denotes the gamma function. The mean and variance of the beta
distribution are

$$ E[p] = \frac{\alpha}{\alpha+\beta}, \qquad \mathrm{Var}[p] = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}. $$
The unconditional probability of repeat buying Brand 1 is given by

$$ P(1,1) = \int_0^1 p^2 f(p)\,dp = \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}. \tag{2} $$
The unconditional probability of switching from Brand 0 to Brand 1 is
given by

$$ P(0,1) = \int_0^1 (1-p)\,p\,f(p)\,dp = \frac{\alpha\beta}{(\alpha+\beta)(\alpha+\beta+1)}. \tag{3} $$

The conditional probabilities P(1|1) and P(0|1) can be obtained by dividing
the two unconditional probabilities in equations (2) and (3) by the market
share of Brand 1, which for beta heterogeneity is α/(α+β). In essence, then,
switching and repeat purchase probabilities can be easily obtained from a
knowledge of the parameters of the mixing beta distribution, α and β. In the
remainder of this section, we show that the conditional as well as the
unconditional switching and repeat purchase probabilities can be easily
obtained from a knowledge of the market share and loyalty index.
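
As a minimal sketch of the calculation just described, the function below evaluates the closed forms of equations (2) and (3) and divides by the market share to obtain the conditional probabilities. The function name and dictionary keys are illustrative choices, not notation from the paper.

```python
def beta_purchase_probabilities(alpha: float, beta: float) -> dict:
    """Two-purchase probabilities for Brand 1 under beta heterogeneity."""
    total = alpha + beta
    share = alpha / total                                          # E[p], market share of Brand 1
    repeat_11 = alpha * (alpha + 1.0) / (total * (total + 1.0))    # equation (2): E[p^2]
    switch_01 = alpha * beta / (total * (total + 1.0))             # equation (3): E[p(1-p)]
    return {
        "share": share,
        "P(1,1)": repeat_11,
        "P(0,1)": switch_01,
        "P(1|1)": repeat_11 / share,   # repeat purchase given Brand 1 was bought
        "P(0|1)": switch_01 / share,   # switch away given Brand 1 was bought
    }

# U shaped case with alpha = beta = 0.5
for name, value in beta_purchase_probabilities(0.5, 0.5).items():
    print(f"{name}: {value:.3f}")
```

For α = β = 0.5 this gives a share of 0.5, P(1,1) = 0.375, P(0,1) = 0.125, and conditional probabilities of 0.75 and 0.25, which sum to one as they should.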

Kalwani and Morrison [6] show that an essential property of the Hendry
System is that switching between two brands i and j will be proportional to
their shares S_i and S_j, that is,

$$ P(i,j) = K_w S_i S_j, \quad j \neq i, \tag{4} $$

where K_w is independent of i and j. From equation (4), it is easy to show
that the repeat purchase probability is given by

$$ P(i,i) = S_i - \sum_{j \neq i} P(i,j) = S_i - K_w S_i (1 - S_i). \tag{5} $$

The conditional probabilities of purchasing brand i on the second purchase
occasion are given by

$$ P(i \mid i) = 1 - K_w (1 - S_i), \qquad P(i \mid j) = K_w S_i, \quad j \neq i. \tag{6} $$

It is not difficult to show that if the mixing distribution is Dirichlet
(the multivariate extension of the beta) then equation (4) will hold (see Bass, Jeuland,
and Wright [2]). For the case of beta heterogeneity, Kalwani and Morrison [6]
find that

$$ K_w = \frac{\alpha+\beta}{\alpha+\beta+1}, \tag{7} $$

and further, relate the Hendry switching constant to Bass's correlation
coefficient [1]:

$$ K_w = 1 - \rho, \tag{8} $$

where ρ is the correlation of successive purchases of a brand. Sabavala and
Morrison [10] suggest the use of φ = 1/(α+β+1) as a measure of loyalty and
term it the Loyalty (or Polarization) Index. They state that φ is a measure
of the strength of preference for and against a brand. A purchase probability
distribution concentrated at the extremes, p = 0 and p = 1, will have a high
value of φ.

In summary, then, a marketing researcher may wish to estimate φ (or ρ, or K_w)
as a measure of loyalty (or switching) rate. Along with market share, denoted
by μ, φ provides estimates of switching and repeat purchase probabilities
as shown in equations (4) through (6).
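
The following sketch illustrates how market share and the loyalty index alone determine the switching and repeat purchase probabilities in the two-brand case. It relies on an assumption that should be stated plainly: under beta heterogeneity, matching equation (3) to equation (4) gives K_w = (α+β)/(α+β+1), which equals 1 - φ when φ = 1/(α+β+1). The function name is hypothetical.

```python
def probabilities_from_share_and_loyalty(mu: float, phi: float) -> dict:
    """Two-brand switching and repeat probabilities from market share mu
    and loyalty index phi, assuming beta heterogeneity."""
    k_w = 1.0 - phi                   # assumed relation: K_w = 1 - phi
    switch = k_w * mu * (1.0 - mu)    # equation (4) with S_1 = mu and S_0 = 1 - mu
    repeat = mu - switch              # equation (5)
    return {
        "K_w": k_w,
        "P(0,1)": switch,
        "P(1,1)": repeat,
        "P(1|1)": repeat / mu,        # equals 1 - K_w (1 - mu)
    }

# The three loyalty levels used in this paper, all at a 50% share.
for phi in (0.2, 1.0 / 3.0, 0.5):
    print(round(phi, 2), probabilities_from_share_and_loyalty(0.5, phi))
```

With μ = 0.5 and φ = 0.5 the sketch returns P(0,1) = 0.125 and P(1,1) = 0.375, the same values obtained directly from α = β = 0.5 in equations (2) and (3).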

In this paper, we determine the sensitivity of maximum likelihood
estimates of zero order models using the α,β and μ,φ parameterizations. We
perform the sensitivity analysis for three zero order models which cover
different shapes of the purchase probability distribution, namely, bell,
uniform, and U. These three models are displayed in Figure 1 along with the
true values of the original parameters α,β and the transformed parameters
μ,φ.


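[Figure 1: Purchase probability distributions f(p), plotted against p from 0 to 1, for the three zero order models.]

Model #1 (bell): α = β = 2.0; μ = 0.5, φ = 0.2
Model #2 (uniform): α = β = 1.0; μ = 0.5, φ = 0.33
Model #3 (U): α = β = 0.5; μ = 0.5, φ = 0.5

METHODOLOGY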


In this section, we first present the beta-binomial distribution to
develop an expression for the unconditional probability of j purchases of
Brand 1 on k trials. Note that the actual purchase records of consumers
are assumed to be organized in a frequency distribution form giving the
number of households, N_j, who make j purchases of Brand 1 on k purchase
occasions. Given these data and the expression for the unconditional probability
of j successes on k trials, we develop the likelihood expressions with the
α,β and μ,φ parameterizations. We conclude this section by presenting
our data simulation procedure.
Beta-Binomial Distribution

Following  the  representation  of  zero  order  models  in  the  previous 
section,  for  a  consumer  who  has  a  purchase  probability  p  of  buying  Brand  1, 
the  number  of  purchases,  j,  of  Brand  1  on  k  trials  is  distributed  binomial 
with  parameters  p  and  k.     That  is, 

$$ P(j; p, k) = \binom{k}{j} p^{j} (1-p)^{k-j}, \quad 0 \le j \le k. $$

As indicated earlier, the probability of purchasing Brand 1 is allowed to
vary over the population of consumers according to the beta distribution with
parameters α and β,

$$ f(p; \alpha, \beta) = \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)}, \quad 0 < p < 1, \; \alpha, \beta > 0, $$

where B(α,β) = Γ(α)Γ(β)/Γ(α+β) is the beta function. The marginal
distribution of j purchases of Brand 1 on k trials is obtained by compounding the
binomial and beta distributions. The unconditional probability of j purchases
of Brand 1 on k trials is given by

$$ P(j; k, \alpha, \beta) = \binom{k}{j} \frac{B(\alpha+j,\, \beta+k-j)}{B(\alpha,\beta)}, \quad j = 0, 1, \ldots, k. \tag{9} $$

This is the beta-binomial distribution whose mean and variance are given by

$$ E[j] = \frac{k\alpha}{\alpha+\beta} \quad \text{and} \quad \mathrm{VAR}[j] = \frac{k\alpha\beta(\alpha+\beta+k)}{(\alpha+\beta)^2(\alpha+\beta+1)}. $$

The expression for the unconditional probability of j successes on k trials
in equation (9) can also be written in terms of the transformed parameters
μ and φ by making the substitutions for α and β as follows:

$$ \alpha = \frac{\mu(1-\phi)}{\phi}, \quad \text{and} \quad \beta = \frac{(1-\mu)(1-\phi)}{\phi}. \tag{10} $$
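
A small sketch of the beta-binomial probability in equation (9) and the change of parameters in equation (10) follows. Evaluating the beta functions through log-gamma is an implementation choice made here for numerical stability, and the function names are illustrative.

```python
import math

def log_beta(a: float, b: float) -> float:
    """log B(a, b) computed from log-gamma values."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(j: int, k: int, alpha: float, beta: float) -> float:
    """Equation (9): probability of j purchases of Brand 1 on k trials."""
    return math.comb(k, j) * math.exp(
        log_beta(alpha + j, beta + k - j) - log_beta(alpha, beta)
    )

def to_alpha_beta(mu: float, phi: float) -> tuple:
    """Equation (10): original parameters from market share and loyalty index."""
    scale = (1.0 - phi) / phi          # equals alpha + beta
    return mu * scale, (1.0 - mu) * scale

def to_mu_phi(alpha: float, beta: float) -> tuple:
    """Inverse map: mu = alpha/(alpha+beta), phi = 1/(alpha+beta+1)."""
    return alpha / (alpha + beta), 1.0 / (alpha + beta + 1.0)

k = 5
alpha, beta = to_alpha_beta(0.5, 0.5)          # gives alpha = beta = 0.5
pmf = [beta_binomial_pmf(j, k, alpha, beta) for j in range(k + 1)]
print([round(p, 4) for p in pmf], round(sum(pmf), 6))
```

Two quick checks are available: the probabilities sum to one, and in the uniform case α = β = 1 every value of j from 0 to k is equally likely, with probability 1/(k+1).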


Parameter  Estimation 

Given the purchase frequency data (i.e., the N_j's), the likelihood function
can be written as

$$ \ell(N_0, N_1, \ldots, N_k; \alpha, \beta, k) = \frac{N!}{N_0!\, N_1! \cdots N_k!} \prod_{j=0}^{k} P_j^{N_j}, \qquad N = \sum_{j=0}^{k} N_j, \tag{11} $$

where the term P_j represents the probability of j purchases of Brand 1 on k
trials and its value is given in equation (9). The maximum likelihood
estimates of α and β can be obtained by maximizing the above likelihood expression
with respect to α and β. Since maximizing a monotonic transform of ℓ(·) does
not change the values of the maximum likelihood estimates, the constant term
in ℓ(·) is replaced by unity and for computational convenience the logarithm
of the altered likelihood function is maximized. The transformed log-likelihood
function, L(·), is given by

$$ L(N_0, N_1, \ldots, N_k; \alpha, \beta, k) = \log\left[ (P_0)^{N_0} (P_1)^{N_1} \cdots (P_k)^{N_k} \right] = \sum_{j=0}^{k} N_j \log P_j. \tag{12} $$

Substituting for P_j from equation (9) and expressing the beta functions in
terms of the gamma functions, the above expression simplifies, apart from an
additive constant that does not involve α and β, to the following
log-likelihood function:

$$ L(N_0, N_1, \ldots, N_k; \alpha, \beta, k) = \sum_{j=0}^{k} N_j \left[ \sum_{r=0}^{j-1} \log(\alpha+r) + \sum_{r=0}^{k-j-1} \log(\beta+r) - \sum_{r=0}^{k-1} \log(\alpha+\beta+r) \right]. \tag{13} $$

Substituting for α and β in terms of μ and φ, the above log-likelihood function
can be rewritten as

$$ L(N_0, N_1, \ldots, N_k; \mu, \phi, k) = \sum_{j=0}^{k} N_j \left[ \sum_{r=0}^{j-1} \log\bigl(\mu(1-\phi) + r\phi\bigr) + \sum_{r=0}^{k-j-1} \log\bigl((1-\mu)(1-\phi) + r\phi\bigr) - \sum_{r=0}^{k-1} \log(1 - \phi + r\phi) \right]. \tag{14} $$

The maximum likelihood estimates of μ and φ can now be found by maximizing
the log-likelihood function in equation (14) with respect to μ and φ.

The likelihood functions L(·) given in equations (13) and (14) are quite
complex and closed form analytical solutions are not available. Therefore
numerical optimization is used to determine the maximum likelihood estimates
of the parameters. The computer program used for this purpose, namely,
"modified pattern search," is based on the pattern search procedure developed
by Hooke and Jeeves [3]. The program provides a general optimization
procedure for any function with a vector of n parameters. It was developed for
Kalwani's doctoral dissertation and is explained in detail in Kalwani [4].
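
To make the estimation step concrete, here is a minimal sketch that maximizes the log-likelihood of equation (14) with a basic Hooke and Jeeves style pattern search. It is not the "modified pattern search" program described above, whose details are in Kalwani [4]; the step size, tolerance, starting point, and all function names are assumptions made for illustration. The test data are expected frequencies computed from equation (9), so the search should land close to the generating values μ = 0.5, φ = 0.5.

```python
import math

def beta_binomial_pmf(j, k, alpha, beta):
    """Equation (9), evaluated through log-gamma."""
    log_b = lambda a, b: math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.comb(k, j) * math.exp(log_b(alpha + j, beta + k - j) - log_b(alpha, beta))

def log_likelihood(counts, mu, phi):
    """Equation (14); counts[j] = households with j purchases in k trials."""
    if not (0.0 < mu < 1.0 and 0.0 < phi < 1.0):
        return -math.inf                       # keep the search inside the parameter space
    k = len(counts) - 1
    norm = sum(math.log(1.0 - phi + r * phi) for r in range(k))
    total = 0.0
    for j, n_j in enumerate(counts):
        buys = sum(math.log(mu * (1.0 - phi) + r * phi) for r in range(j))
        others = sum(math.log((1.0 - mu) * (1.0 - phi) + r * phi) for r in range(k - j))
        total += n_j * (buys + others - norm)
    return total

def explore(f, point, value, step):
    """Exploratory move: try +/- step along each coordinate, keep improvements."""
    point = list(point)
    for i in range(len(point)):
        for delta in (step, -step):
            candidate = list(point)
            candidate[i] += delta
            candidate_value = f(candidate)
            if candidate_value > value:
                point, value = candidate, candidate_value
                break
    return point, value

def pattern_search(f, start, step=0.1, tol=1e-8):
    """Maximize f; halve the step whenever no exploratory move helps."""
    base, base_value = list(start), f(start)
    while step > tol:
        new, new_value = explore(f, base, base_value, step)
        if new_value > base_value:
            while new_value > base_value:
                # Pattern move: jump along the successful direction, then explore again.
                pattern = [2.0 * n - b for n, b in zip(new, base)]
                base, base_value = new, new_value
                new, new_value = explore(f, pattern, f(pattern), step)
        else:
            step /= 2.0
    return base, base_value

true_mu, true_phi, k, n = 0.5, 0.5, 5, 1000
alpha = true_mu * (1.0 - true_phi) / true_phi
beta = (1.0 - true_mu) * (1.0 - true_phi) / true_phi
expected = [n * beta_binomial_pmf(j, k, alpha, beta) for j in range(k + 1)]
estimate, _ = pattern_search(lambda x: log_likelihood(expected, x[0], x[1]), [0.3, 0.3])
print([round(v, 4) for v in estimate])
```

Feeding the optimizer theoretical frequencies rather than simulated ones isolates numerical error from sampling error, since any gap between the estimate and the generating values is then due to the search alone.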

Simulated  Data  Generation 

Simulated  data  are  used  to  determine  sample  size  and  purchase  sequence 
length  requirements  for  estimating  the  model  parameters.  The  first  step  in 
the procedure is to generate 50 samples of size N (equal to 100, 300, or 500)
and purchase sequence length k (equal to 5, 10, or 20). In other words,
for  each  of  the  three  zero  order  models  displayed  in  Figure  1,  50  samples 
are  generated  for  nine  (3  values  of  N  times  3  values  of  k)  different  sample 
specifications  starting  with  purchase  sequence  length  of  5  with  sample  size 
of  100  and  ending  with  purchase  sequence  length  of  20  with  sample  size  of 
500. 

The output for each sample specification (say, sample size = 300 and
purchase sequence length = 10) in each of the 50 simulations is a purchase
frequency distribution which gives the number of consumers, N_j, who make j
(where j = 0, 1, ..., k) purchases of Brand 1 on k choice occasions.

The second step in the procedure involves the estimation of the model
parameters, α,β or μ,φ, for each of the 50 simulated purchase frequency
distributions. Next the means, standard deviations, and coefficients of
variation of the maximum likelihood estimates from the 50 simulations are computed.
This second step is implemented for each of the nine N,k sample specifications.
FINDINGS 

This  section  contains  the  findings  from  the  Monte  Carlo  simulation 
runs  for  each  of  the  three  zero  order  models  displayed  in  Figure  1.  The 
results  reported  here  cover  three  purchase  sequence  lengths--5,  10,  and  20. 
Note  that  for  many  frequently  bought  grocery  and  household  items,  5  purchases, 
for  many  households,  would  cover  a  quarter,  10  would  extend  over  half  a  year, 
and  20  would  span  a  year.  The  final  sample  size  requirements  for  each  of 
these  three  purchase  sequence  lengths  are  based  on  combining  our  findings 
on  sample  size  requirements  using  sample  sizes  of  100,  300,  and  500. 
Since the U shaped purchase probability distribution has been found to
fit empirical data well (see Kalwani and Morrison [7]), we use Model #3
(see Figure 1) to illustrate our findings. Table 1 displays coefficients
of variation associated with estimation of the parameters of Model #3. In
the α,β parameterization the two parameters have equal stability and the
coefficients of variation displayed in Table 1 are averages of those for
α and β. In the case of the μ,φ parameterization the parameter μ is more
stable than the parameter φ. The coefficients of variation of the parameter
φ displayed under the μ,φ parameterization are about four-thirds of the
coefficients of variation of the μ parameter. Therefore, it is the φ
parameter which determines the sample size requirements and Table 1 contains
coefficients of variation of φ.

INSERT  TABLE  1  HERE 

The  coefficients  of  variation  displayed  in  Table  1  as  well  as  those 
for  the  other  two  models  were  found  to  be  consistent  with  the  "inverse 
square  root  of  n  relationship"  which  provides  a  check  on  the  reliability 
of  the  modified  pattern  search  program  for  obtaining  the  estimates  of  the 
model parameters. That is, the coefficient of variation obtained for a
sample size n times larger than another sample is (1/√n) times the
coefficient of variation for the latter sample.

The "inverse square root of n relationship" can be used to obtain the
sample sizes required for any desired level of accuracy in parameter
estimation (see footnote 2). Table 2 displays the sample sizes required for various models when
the purchase sequence lengths of the simulated samples are 5, 10 and 20.
These sample sizes have been obtained for a coefficient of variation of .05.
The implication of selecting this ratio as 5% is that 95% of the parameter
estimates can be expected to fall within 10% of the true parameter value
(i.e., in the range ±10%).

INSERT  TABLE  2  HERE 

An  examination  of  the  findings  displayed  in  Table  2  reveals  that  it  is 
easier  to  estimate  parameters  for  a  U  shape  purchase  probability  distribution 
(Model  #3)  than  for  a  uniform  (Model  #2)  or  bell  shape  (Model  #1).  This 
result holds across both the α,β and μ,φ parameterizations.

More importantly, however, the results displayed in Table 2 reveal that
the sample sizes required for estimating the original parameters α and
β within ±10% of their true values are excessive. Even in the case of the
"easiest to estimate" U shape purchase probability distribution a panel size
of about 2000 is required given 5 purchases for each household. On the other
hand, the sample size requirements are considerably smaller in the case of the
μ,φ parameterization. This indicates that the transformed parameters μ and
φ are more stable than the original parameters α and β. Therefore, in
applications where knowledge of the market share, μ, and the loyalty index
(or Hendry switching constant, or Bass's correlation coefficient) would suffice,
it is much more efficient to estimate them directly rather than computing
them indirectly from estimates of α and β.

Many researchers in the past have used sample data with purchase
sequence lengths of 5 or less when estimating the original parameters of
the purchase probability distribution. To the extent that inadequate
sample sizes have been used to estimate model parameters, the results
obtained in such models are suspect. Note, however, that in the case of Model
#3 a sample size of 400 with 5 purchase records per household is sufficient
to estimate the transformed parameters μ and φ within ±10%.

Two cautionary comments are in order. Since maximum likelihood
estimates are invariant, we can obtain maximum likelihood estimates of the
original parameters α and β from maximum likelihood estimates of the
transformed parameters μ and φ (see equation (10)). Note, however, that
the accuracy levels of the estimates of α and β will in general be
different from those of the transformed parameters μ and φ. Specifically,
while reduced sample sizes are needed to estimate μ and φ (say, within ±10%)
by searching in the μ,φ space, the estimates of α,β obtained therefrom will
be less accurate than the estimates of μ and φ. Therefore, in zero order
applications where estimates of α and β are needed within ±10%, the prior
search in the μ,φ space does not help in reducing the data requirements: the
results displayed in Table 2 under the α,β parameterization still hold
and indicate the sample size requirements.

Our  second  caveat  is  in  connection  with  the  effect  of  the  market  share 
of the brand under consideration, namely Brand 1, on the sample size
requirements. We have obtained data requirements for three different zero order
models keeping the brand share constant at 50%. Obviously, the sample size
requirements will be larger for smaller shares of Brand 1. Our findings
elsewhere confirm this intuitively appealing proposition (Kalwani and Morrison [5]).
This  increase  in  sample  size  requirements  for  the  three  purchase  sequence 
lengths  is  small  -  around  10%  to  20%  -  for  the  initial  reduction  in  share  of 
Brand  1  from  50%  to  25%.  However,  as  the  share  of  Brand  1  reduces  to  around 
12%,  the  increase  in  sample  size  requirement  is  around  50%  to  100%.  The 
reader  is  referred  to  Kalwani  and  Morrison  [5]  for  further  details. 
CONCLUSIONS

In this paper, the sensitivity of maximum likelihood estimates in zero
order models was investigated with a view to determining sample sizes
required to estimate parameters within a given accuracy level (within ±10%
of the true value). We used simulated data from bell, uniform, and U
shaped purchase probability distributions to determine sample size
requirements for three purchase sequence lengths, namely 5, 10, and 20.
Our  primary  conclusion  is  that  the  maximum  likelihood  estimates  of  the 
original  parameters  of  the  purchase  probability  distribution  are 
sensitive  and  that  the  coefficients  of  variation  for  commonly  used  sample 
sizes  are  large.  Specifically,  our  findings  reveal  that  for  the  U 
shaped  purchase  probability  distribution  which  is  the  model  that  fits 
empirical  data  well,  the  sample  size  required  with  a  purchase  sequence 
length  of  5  is  around  2000. 

In  many  zero  order  applications,  however,  knowledge  of  the  market 
share  and  loyalty  index  (or  Hendry  switching  constant,  or  Bass's 
correlation coefficient) may suffice. We demonstrated that, assuming a
zero order process and beta heterogeneity, conditional as well as
unconditional switching and repeat purchase probabilities can be easily
obtained  from  a  knowledge  of  the  market  share  and  loyalty  index.  From  a 
statistical  viewpoint  we  found  that  it  is  much  more  efficient  to  estimate 
these  two  transformed  parameters  rather  than  the  original  parameters. 
Specifically,  for  the  U  shaped  purchase  probability  distribution  sample 
size  of  400  given  5  purchases  per  household  was  found  to  be  adequate  to 
estimate  these  two  transformed  parameters  within  10%  of  their  true  value. 




FOOTNOTES 


1. The reliability of the numerical optimization program is crucial to
the accuracy of sample size requirements developed in this paper. It
should be pointed out that the modified pattern search program has
been checked very thoroughly at both Columbia University and M.I.T.
Several  tests  were  carried  out  to  measure  the  magnitude  of  the 
numerical  error  in  our  computer  program.  The  number  of  consumers 
who  make  j  purchases  of  Brand  1  and  (k-j)  purchases  of  Brand  0 
were  obtained  theoretically  using  equation  (9).  This  was  done  for 
a  variety  of  parameter  values  and  sample  sizes.  The  modified  pattern 
search  program  was  then  used  to  estimate  the  known  parameter  values 
which  were  used  to  generate  the  theoretical  data  in  the  first  place. 
The  true  parameter  values  were  reproduced  within  an  accuracy  of  1 
in  10,000. 

2. To illustrate the computation of sample size requirements in
Table 2, consider the entry for purchase sequence length of 5
in the case of Model #3 with the α,β parameterization. The sample size
requirement of 2005 represents an average based on three figures
from Table 1 as shown below:

$$ \frac{1}{3}\left(\frac{.214}{.05}\right)^2 (100) + \frac{1}{3}\left(\frac{.134}{.05}\right)^2 (300) + \frac{1}{3}\left(\frac{.101}{.05}\right)^2 (500) \approx 2005. $$
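
The averaging in this footnote can be sketched in a few lines. The coefficients of variation below are the three-decimal figures printed in Table 1, so the results may differ by a few units from the Table 2 entries, which were presumably computed from unrounded values; the function name is illustrative.

```python
def required_sample_size(cv_by_n: dict, target_cv: float = 0.05) -> float:
    """Average of the sample sizes implied by each simulated (n, CV) pair.
    Since CV is proportional to 1/sqrt(n), a pair (n, cv) implies that
    n * (cv / target_cv)**2 observations reach the target CV."""
    implied = [n * (cv / target_cv) ** 2 for n, cv in cv_by_n.items()]
    return sum(implied) / len(implied)

# Model #3, purchase sequence length 5, rounded values from Table 1.
cv_alpha_beta = {100: 0.214, 300: 0.134, 500: 0.101}
cv_mu_phi = {100: 0.095, 300: 0.059, 500: 0.045}

print(round(required_sample_size(cv_alpha_beta), 1))
print(round(required_sample_size(cv_mu_phi), 1))
```

With the rounded inputs the sketch gives roughly 2009 for the α,β parameterization and 394.6 for the μ,φ parameterization, in line with the Table 2 entries of 2005 and 394.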
Table 1

COEFFICIENTS OF VARIATION* FOR DIFFERENT SAMPLE
SPECIFICATIONS OF MODEL #3

                          Parameterization
                 α,β                            μ,φ
SAMPLE    Purchase Sequence Length     Purchase Sequence Length
SIZE         5       10       20          5       10       20

 100       .214     .180     .194       .095     .076     .077
 300       .134     .107     .086       .059     .047     .037
 500       .101     .081     .075       .045     .037     .032

*In the case of the α,β parameterization the numbers in the table represent an
average of the coefficients of variation of the estimates of α and β. In the
case of the μ,φ parameterization, since the parameter φ is less stable than
the parameter μ, the numbers in the table represent the coefficient of variation
of the estimate of φ.
Table 2

SAMPLE SIZES REQUIRED FOR MEETING
THE ACCURACY CRITERION OF ±10%

                          Parameterization
                 α,β                            μ,φ
          Purchase Sequence Length     Purchase Sequence Length
Model*       5       10       20          5       10       20

Model #1   3563     2865     1496       1822     1332      708
Model #2   2626     1795     1158        903      627      371
Model #3   2005     1328     1176        394      254      201

* For Model #1: α=β=2; for Model #2: α=β=1.0; for Model #3: α=β=0.5.